fix(api): add security sentinel ingest compatibility endpoint #1
Superseded by #2 (clean main-based branch for the same fix).
Pull request overview
This PR adds a Security Sentinel compatibility endpoint (POST /ingest) to the FastCode API so that OmniLore's white-label gateway can forward security events without falling back to vector-store mode. It also refactors file-ignore logic in the repository loader to use pathspec/GitWildMatchPattern directly (instead of calling the utility helper per file), and expands ignore patterns and modifies several global config defaults.
Changes:
- Added `POST /ingest` endpoint with an in-memory bounded buffer for OmniLore Security Sentinel compatibility
- Refactored `RepositoryLoader.scan_files()` to build a single `PathSpec` object for repo-relative ignore matching, replacing per-call `should_ignore_path` utility usage
- Updated `config/config.yaml`: added many new ignore patterns, removed several `supported_extensions`, and changed `enable_agency_mode` and `use_llm_enhancement` defaults
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `api.py` | Adds `SecurityEventIngestRequest` model, `security_event_buffer` global, and `POST /ingest` handler; surfaces `security_ingest_enabled` in health check |
| `fastcode/loader.py` | Replaces per-file `should_ignore_path()` calls with a single `PathSpec` built at scan start; drops `should_ignore_path` import |
| `config/config.yaml` | Adds OmniLore/tooling-specific ignore patterns; removes `.md`, `.txt`, `.rst`, `.json`, `.html`, `.css`, `.xml` from supported extensions; flips `enable_agency_mode` and `use_llm_enhancement` to off |
    total_size = 0
    max_file_size_bytes = self.max_file_size_mb * 1024 * 1024

    ignore_spec = PathSpec.from_lines(GitWildMatchPattern, effective_ignore)
The variable effective_ignore used on this line is never defined anywhere in scan_files or the surrounding scope. This will raise a NameError at runtime every time scan_files() is called, completely breaking repository scanning. It should be replaced with self.ignore_patterns, which is the instance attribute populated from config and was used by the previous should_ignore_path calls.
Suggested change:

    - ignore_spec = PathSpec.from_lines(GitWildMatchPattern, effective_ignore)
    + ignore_spec = PathSpec.from_lines(GitWildMatchPattern, self.ignore_patterns)
    OmniLore white-label tooling posts security events here when configured with
    OMNILORE_SECURITY_SENTINEL_URL=http://127.0.0.1:8001.
    """
    fastcode = _ensure_fastcode_initialized()
Calling _ensure_fastcode_initialized() here forces lazy initialization of the full FastCode system (which loads models, configuration, etc.) on every security ingest call, even when the ingest endpoint is purely a compatibility shim that buffers events. The fastcode local variable is only used to read repo_loaded and repo_indexed in the response — fields which could simply default to False (or be omitted) when the system hasn't been initialized yet. Consider using fastcode_instance directly (checking for None first) rather than triggering expensive initialization on what is meant to be a lightweight ingest path.
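The suggestion above could be sketched roughly as follows. This is a minimal illustration, not the project's actual handler: the helper name `ingest_status_fields` is hypothetical, and it assumes the module-level `fastcode_instance` is either `None` or an object exposing `repo_loaded` and `repo_indexed` attributes, as the comment implies.

```python
from typing import Any, Dict, Optional


def ingest_status_fields(instance: Optional[Any]) -> Dict[str, bool]:
    """Report repo status without triggering FastCode initialization.

    `instance` stands in for the module-level `fastcode_instance`
    (assumed to be None until the system has been lazily initialized).
    """
    if instance is None:
        # Not initialized yet: report defaults instead of loading models.
        return {"repo_loaded": False, "repo_indexed": False}
    return {
        "repo_loaded": bool(getattr(instance, "repo_loaded", False)),
        "repo_indexed": bool(getattr(instance, "repo_indexed", False)),
    }
```

The handler would then call `ingest_status_fields(fastcode_instance)` when building its response, so a burst of ingest calls never pays the initialization cost.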
    security_event_buffer.append(record)
    if len(security_event_buffer) > SECURITY_EVENT_BUFFER_LIMIT:
        security_event_buffer.pop(0)
Using a plain list with pop(0) as a bounded buffer is O(n) for every eviction, because it shifts all remaining elements. With the default buffer limit of 500 entries this is acceptable, but under a burst of security events (e.g., from a misconfigured sentinel) the per-call cost accumulates. The idiomatic and efficient Python structure for a bounded FIFO is collections.deque(maxlen=SECURITY_EVENT_BUFFER_LIMIT), which makes both append and eviction O(1).
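A minimal sketch of the deque-based buffer, assuming the limit of 500 mentioned above; the helper name `buffer_event` is hypothetical and only wraps the append for illustration.

```python
from collections import deque

# Limit taken from the "default buffer limit of 500 entries" noted above.
SECURITY_EVENT_BUFFER_LIMIT = 500

# With maxlen set, appending to a full deque silently drops the oldest
# entry, so no explicit length check or pop(0) is needed.
security_event_buffer = deque(maxlen=SECURITY_EVENT_BUFFER_LIMIT)


def buffer_event(record: dict) -> None:
    """Append a record; eviction of the oldest entry is handled by maxlen."""
    security_event_buffer.append(record)
```

One caveat if this is adopted: a deque does not support slicing, so any code that reads the buffer with `security_event_buffer[-n:]` would need `list(security_event_buffer)[-n:]` instead.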
    - .toml
    - .md
    - .txt
    - .yaml
The removal of .md, .txt, .rst, .json, .html, .css, and .xml from supported_extensions is a broad behavioral change that affects all repository indexing, not just OmniLore/Security Sentinel use cases. This means documentation files (READMEs, RST docs), configuration/data files (JSON), and web assets (HTML, CSS, XML) will no longer be indexed for any user of this service. The PR description frames this PR as adding a compatibility endpoint for OmniLore security sentinel calls, but this extension removal is a significant unrelated regression that reduces the system's utility. These changes should either be reverted or explicitly justified and scoped.
    @@ -129,8 +165,8 @@ query:
       detect_intent: true  # Detect query type (how/what/where/debug/implement)

       # LLM-Enhanced Processing
    -  use_llm_enhancement: true  # Enable LLM-based query understanding
    -  llm_enhancement_mode: "always"  # Options: "adaptive", "always", "off"
    +  use_llm_enhancement: false  # Disable LLM rewrite for deterministic local routing
    +  llm_enhancement_mode: "off"  # Options: "adaptive", "always", "off"
Changing enable_agency_mode from true to false and use_llm_enhancement / llm_enhancement_mode to false / "off" are global defaults that affect all users of this service, not just OmniLore configurations. These changes will degrade retrieval quality for all existing deployments that relied on the previous defaults. The PR description mentions "prefer deterministic retrieval for local stability" but this is not mentioned in the stated purpose of the PR (adding a security sentinel ingest compatibility endpoint). These config changes should be clearly justified or made opt-in rather than changed as global defaults.
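One way the "opt-in" alternative might look is sketched below. Everything here is an assumption for illustration: the flat key layout, the profile name `DETERMINISTIC_LOCAL_PROFILE`, and the helper `resolve_settings` are hypothetical and do not reflect how FastCode actually loads `config/config.yaml`. Only the three setting names and their old and new values come from the diff.

```python
from typing import Any, Dict, Optional

# Previous global defaults, as described in the review comment.
BASE_DEFAULTS: Dict[str, Any] = {
    "enable_agency_mode": True,
    "use_llm_enhancement": True,
    "llm_enhancement_mode": "always",
}

# Hypothetical opt-in profile carrying the values this PR sets globally.
DETERMINISTIC_LOCAL_PROFILE: Dict[str, Any] = {
    "enable_agency_mode": False,
    "use_llm_enhancement": False,
    "llm_enhancement_mode": "off",
}


def resolve_settings(profile: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Start from the shared defaults and apply a profile only when one is given."""
    settings = dict(BASE_DEFAULTS)
    if profile:
        settings.update(profile)
    return settings
```

With this shape, existing deployments calling `resolve_settings()` keep the old behavior, while an OmniLore deployment would pass `DETERMINISTIC_LOCAL_PROFILE` explicitly.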
    logger.warning(
        "Security ingest accepted (compat): event_type=%s tenant_id=%s",
        event_type,
        tenant_id,
    )
Every security ingest is logged at WARNING level. A successful, routine ingest is not a warning condition — WARNING level is typically reserved for unexpected or degraded states. Routine events like this should use logger.info instead, to avoid polluting warning-level monitoring and alerting. The existing pattern throughout the rest of api.py (lines 284, 307, etc.) uses logger.info for successful operations.
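The change being requested is small; a sketch under the assumption that `api.py` uses a standard `logging` logger (the logger name `"api"` and the wrapper function `log_ingest_accepted` are hypothetical):

```python
import logging

logger = logging.getLogger("api")  # hypothetical logger name


def log_ingest_accepted(event_type: str, tenant_id: str) -> None:
    """Log a routine, successful ingest at INFO rather than WARNING."""
    logger.info(
        "Security ingest accepted (compat): event_type=%s tenant_id=%s",
        event_type,
        tenant_id,
    )
```

The message and lazy `%s` formatting are unchanged from the diff; only the level differs, so warning-level alerting stays reserved for degraded states.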
Summary
- `POST /ingest` compatibility endpoint for OmniLore Security Sentinel calls
- `/health`

Why
OmniLore white-label gateway points security sentinel URL at `127.0.0.1:8001` (FastCode). Without `/ingest`, events fell back to vector-store mode. This PR restores native ingest behavior.

Validation
- `POST /ingest` returns `status=received`
- `security_sentinel_ingest` returns HTTP 200 (no fallback)
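For anyone wanting to repeat the validation by hand, a request along these lines might work. This is only a sketch: the payload fields `event_type` and `tenant_id` are inferred from the handler's log message, the full `SecurityEventIngestRequest` schema is not shown in this PR page, and the helper names are hypothetical.

```python
import json
import urllib.request

# Base URL taken from OMNILORE_SECURITY_SENTINEL_URL in the PR.
BASE_URL = "http://127.0.0.1:8001"


def build_ingest_request(event_type: str, tenant_id: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to the /ingest endpoint."""
    # Assumed payload shape; the real model may require other fields.
    payload = {"event_type": event_type, "tenant_id": tenant_id}
    return urllib.request.Request(
        url=f"{base_url}/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0):
    """Send the request and return (HTTP status, decoded JSON body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, json.loads(response.read().decode("utf-8"))
```

Against a running FastCode instance, `send(build_ingest_request(...))` would be expected to return status 200 with a body containing `status=received`, per the validation notes above.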